Evidence-Based Medicine Supercharger

ABSTRACT

Apparatuses, computer media, and methods for supporting health needs of a consumer by processing input data. An integrated health management platform supports the management of healthcare by obtaining multi-dimensional input data for a consumer, determining a health-trajectory predictor from the multi-dimensional input data, identifying a target of opportunity for the consumer in accordance with the health-trajectory predictor, and offering the target of opportunity for the consumer. An outcome study for a medical treatment from a medical publication is detected. The medical treatment is mapped to a diagnostic and procedural code and a database for health data is accessed using the diagnostic and procedural code. An outcome metric for the medical treatment with a consumer group is associated, and a utility function is generated from the plurality of outcome metrics, where the utility function gauges an efficacy of at least one intervention channel for a consumer.

This application is a continuation of common-owned, co-pending U.S. application Ser. No. 11/612,763 (“Integrated Health Management Platform”) filed on Dec. 19, 2006 naming David H. Kil, the entire disclosure of which is hereby incorporated by reference.

FIELD OF THE INVENTION

This invention relates generally to healthcare management. More particularly, the invention provides apparatuses, computer media, and methods for supporting health needs of a consumer by processing input data.

BACKGROUND OF THE INVENTION

The U.S. healthcare industry is a $2T economy with the rate of growth far exceeding that of general inflation. With the aging global population, the current healthcare crisis is expected to worsen, threatening the health of the global economy. The existing healthcare ecosystem is zero-sum. The recent pay-for-performance (P4P) experiment by the National Health Service in the United Kingdom resulted in mixed outcomes, with incentive-based payments far exceeding the budget and with uncertain improvements in patient health. On the other hand, a recent study on the sophistication of healthcare consumers reveals that there is little correlation between consumers' perception of care and the actual quality of healthcare delivered as measured by RAND's 236 quality indicators. Furthermore, given the high churn rate and the propensity of employers to seek the lowest-cost health plan, payers are motivated to focus primarily on reducing short-term cost and carving out the cream-of-the-crop population, resulting in perverse benefit design.

In healthcare, predictive models are used to improve underwriting accuracies and to identify at-risk members for clinical programs, such as various condition-centric disease management programs. Unfortunately, predictive models typically use year-1 payer claims data to predict year-2 cost. Some predictive modeling vendors predict future inpatient or emergency-room episodes since they represent high-cost events. The emphasis on cost makes sense given that the impetus for predictive models came from private and government payers struggling with rising healthcare costs.

Evidence-based medicine (EBM) is an attempt to apply scientific evidence to making care decisions for patients. Many EBM guidelines are derived from medical journals, where teams of researchers rely on randomized controlled trials and observational studies to draw inferences on the efficacy of various treatments on carefully selected patient populations. Pharmacovigilance, or the study of adverse drug reactions, is an example of EBM.

Current EBM vendors, such as Active Health Management, a wholly owned subsidiary of Aetna, and Resolution Health, rely on a team of physicians reading and codifying relevant medical journals. The resulting EBM database is applied to population claims data consisting of medical, Rx, and lab claims data in order to identify patients not receiving care consistent with EBM guidelines, i.e., with “gaps” in treatment. Physicians of the identified patients are contacted through faxes or telephone calls with instructions or recommendations on how to close the gaps in treatment. A number of shortcomings exist with the current EBM implementation. Many EBM studies suffer from small sample size, thus making generalization difficult and sometimes inaccurate. A corollary of the first shortcoming is that most EBM studies are at a selected population level and do not provide drilldown information at a sub-population level. That is, if not everyone benefits from an EBM guideline, it may be dangerous to apply the guideline to the entire study population, which calls for a careful tradeoff between specificity and sensitivity. Guidelines typically do a poor job of translating study outcomes into metrics that end stakeholders care about. For example, payers pay particular attention to cost, which is not the same as improving surrogate endpoints that are therapeutic in nature with various time frames for healing or outcomes improvement. Publication bias and conflicting results encourage ad hoc decision making on the part of payers in the area of utilization management, such as coverage denials and medical necessity reviews. Furthermore, relying on published guidelines discourages the use of autonomous or loosely guided search for anomalies or precursors to adverse outcomes using a large amount of integrated data assets and intelligent search algorithms based on machine learning.

Clearly, there is a need for an integrated solution for providing healthcare management.

BRIEF SUMMARY OF THE INVENTION

The present invention provides apparatuses, computer media, and methods for supporting health needs of a consumer by processing input data.

With one aspect of the invention, an integrated health management platform supports the management of healthcare by obtaining multi-dimensional input data for a consumer, determining a health-trajectory predictor from the multi-dimensional input data, identifying a target of opportunity for the consumer in accordance with the health-trajectory predictor, and offering the target of opportunity for the consumer. Multi-dimensional input data may include claim data, consumer behavior marketing data, self-reported data, and biometric data.

With another aspect of the invention, a consumer is assigned to a cluster or clusters based on the multi-dimensional input data. A characteristic of the consumer may be inferred from a subset of the multi-dimensional input data.

With another aspect of the invention, a cluster is associated with a disease progression, where the cluster is associated with at least one attribute of a consumer. A target of opportunity is determined from the cluster and the disease progression. An impact of the target of opportunity may be assessed by delivering treatment to a consumer at an appropriate time.

With another aspect of the invention, a target of opportunity is extracted from medical information using a set of rules for the multi-dimensional input data.

With another aspect of the invention, a previous event that occurred before a subsequent transition event is identified. A correlation between the previous event and the subsequent transition event is measured from historical data to assign multidimensional strength or utility indicators to a discovered rule.

With another aspect of the invention, an enrollment healthcare selection for the consumer is recommended based on multi-dimensional input data.

With another aspect of the invention, an outcome study for a medical treatment from a medical publication is detected. The medical treatment is mapped to a diagnostic and procedural code and a database for health data is accessed using the diagnostic and procedural code. An outcome metric for the medical treatment with a consumer group is associated, and a utility function is generated from the plurality of outcome metrics, where the utility function gauges an efficacy of at least one intervention channel for a consumer.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limited in the accompanying figures in which like reference numerals indicate similar elements and in which:

FIG. 1 shows an architecture of an integrated health management (IHM) platform in accordance with an embodiment of the invention.

FIG. 1A shows a Service Oriented Architecture (SOA) framework in accordance with an embodiment of the invention.

FIG. 2 shows a method of determining multimode health-trajectory predictors in accordance with an embodiment of the invention.

FIG. 3 shows a flow diagram for an evidence-based medicine supercharger in accordance with an embodiment of the invention.

FIG. 4 shows a flowchart for an autonomous healthcare data exploration system in accordance with an embodiment of the invention.

FIG. 5 shows an illustrative conceptual example of the optimal health benefit design in accordance with an embodiment of the invention.

FIG. 6 shows an example of Markov modeling of assessing a target of opportunity in accordance with an embodiment of the invention.

FIG. 7 shows computer system 100 that supports an embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

Integrated Health Management Platform in a Service Oriented Architecture (SOA) Framework

FIG. 1 shows an architecture of integrated health management (IHM) platform 100 in accordance with an embodiment of the invention. IHM platform 100 creates for payers a virtuous circle of integrated informatics leading to improved and real-time decision making, leading to healthier members, leading to improved profitability and cost savings, leading to improved market share. For consumers who must share an increasing burden of medical costs, the execution of the IHM platform may lead to improved health and subsequent cost savings. In the following discussion, a consumer may be an employee of a company or an individual choosing a healthcare plan from a plurality of plans and consuming products/services from various healthcare stakeholders. An objective of the consumer is to maximize benefits associated with good health by choosing a healthcare plan that is “best” for the individual or his/her family and improving health through timely preventive and proactive health actions.

The IHM platform consists of the following four components:

-   1. Multimode health-trajectory predictors module 101: Instead of focusing on predicting future cost alone using claims data as most predictive models do now, multimode health-trajectory predictors leverage claims data 153, self-reported data 151, and consumer behavior marketing data 155, coupled with inference engines 115, to provide a comprehensive set of future attributes useful to assess the level of impact through various consumer-engagement channels. Claims data 153 may include medical claims, pharmacy claims, prior authorization, and lab results (e.g., blood tests) for a consumer. Consumer-engagement channels may encompass secure e-mails, Interactive Voice Response (IVR) calls, cellphone text messages, and nurse calls. Data Merge & Cleaning 109 performs extract-transform-load (ETL) of disparate data assets to form a consumer-centric view while cleaning data prior to weak-signal transformation through digital signal processing (DSP) and feature extraction 111. Disease clustering and progression module 113 subsequently forms disease clusters and estimates disease progression probabilities. Clustering optimization & inference 115 performs clustering using attributes that are meaningful from the perspective of predicting future health trajectories and impactability, with the inference engine filling in unobserved variables using the instantiated variables. A modular predictive model is developed for each consumer cluster so that a collection of locally optimized predictive models can provide a globally optimal performance 117. Finally, a set of scores encompassing health scores, behavior/lifestyle scores, engagement scores, impact scores, data-conflict scores, cost scores, and clinical scores is output 119.
-   2. Targets-of-opportunity finder 103: Leveraging consumer-understanding technologies, an evidence-based-medicine (EBM) supercharger (shown as EBM supercharger 300 in FIG. 3), and an autonomous insight crawler, one can identify targets of opportunities in various consumer touch points. The four major opportunities lie in clinical gaps 121, treatment adherence 123, lifestyle/behavior 125, and psychosocial parameters 126. Impact assessment is made based on the aggregate future impact of all the identified targets of opportunities 127.
-   3. Resource-allocation manager 105: Resource-allocation manager (RAM) 105 funnels the right members to the right consumer touch points at the right time by maximizing multi-objective functions. Also included in RAM 105 are consumer-understanding technologies and iterative benefit design borrowing salient concepts from adaptive conjoint analysis, predictive modeling, and Pareto multi-objective optimization. Furthermore, mixing currently available technologies into consumer touch points in conjunction with dynamic progressive content tailoring allows one to go beyond the typical nurse-based care model, which is inherently not scalable, especially with the projected worsening nurse shortage in the labor market. (Resource-allocation manager 105 is Pareto efficient if no consumer can be made better off without another consumer being made worse off.) The fundamental idea here is building a multi-objective constrained optimization engine 137 as a function of consumer, intervention-channel, and benefit-program profiles (129, 131, 135) and utility functions 133 derived from the impact-analysis engine.
-   4. Impact-analysis engine 107: This module tells one what works for which population segments, by how much, and why in a drilldown mode. It facilitates the use of utility functions in the framework of resource-allocation optimization as done in defense battlefield resource management. The methodology employed uses predictive modeling, combinatorial and stochastic feature optimization with respect to outcomes, and propensity-score shaping. After selecting a candidate population for analysis 141, one performs thorough matching in the two-dimensional space of propensity and predictive scores 143 to create control and intervention groups 145 for an “apple-to-apple comparison.” One then creates rules of engagement for statistically significant outcomes, which are further validated through focus-group study 149 and survey using the minimum number of necessary questions 148. Validated rules 150 are stored in the master rules database for production implementation.

The four above components 101-107 complement one another and are ideally suited to assessing the incremental benefits of bringing new data assets and business processes into enterprise operations. In order to facilitate integration into and compatibility with typical payer enterprise applications, the IHM implementation (e.g., IHM platform 100) adheres to an enhanced Service Oriented Architecture (SOA) framework. A key idea here is maximizing synergy among business process primitives, data models, and algorithm models so that one can reduce latency between the generation of actionable knowledge and its production implementation.
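
The matching step of the impact-analysis engine (pairing members in the two-dimensional space of propensity and predictive scores) can be illustrated with a minimal Python sketch. The text does not specify a matching algorithm, so the greedy one-to-one nearest-neighbor rule, the Euclidean distance, the `caliper` value, and all member identifiers and scores below are assumptions made for illustration only. Both scores are assumed to be scaled to the range 0 to 1.

```python
from math import hypot


def match_groups(treated, untreated, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Each member is a tuple (member_id, propensity_score, predictive_score).
    A pair is formed only if the 2-D Euclidean distance is within `caliper`
    (a hypothetical tolerance, not taken from the source text).
    """
    available = list(untreated)
    pairs = []
    for t_id, t_prop, t_pred in treated:
        best, best_dist = None, caliper
        for cand in available:
            dist = hypot(t_prop - cand[1], t_pred - cand[2])
            if dist <= best_dist:
                best, best_dist = cand, dist
        if best is not None:
            pairs.append((t_id, best[0]))
            available.remove(best)  # each control member is used at most once
    return pairs


# Hypothetical members: intervention candidates and potential controls.
treated = [("T1", 0.62, 0.40), ("T2", 0.20, 0.85)]
untreated = [("C1", 0.60, 0.42), ("C2", 0.90, 0.10), ("C3", 0.21, 0.80)]
pairs = match_groups(treated, untreated)
print(pairs)
```

Members left unmatched (here the hypothetical "C2") would simply be excluded from the comparison, which is one way to keep the control and intervention groups comparable.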

FIG. 1A shows Service Oriented Architecture (SOA) framework 160 of IHM platform 100 in accordance with an embodiment of the invention. Framework 160 increases synergy in data models, mathematical models, and business-process models that are important in ensuring the success of IHM Platform 100. Inputs 161 consist of data library 163, algorithm library 165, and business-process libraries 167, which get updated with the latest discoveries. The processing layer uses the building blocks of business processes and algorithms tailored to underlying data models to produce intermediate processing outputs as well as actionable insights that feed to multimedia outputs for dissemination to the key stakeholders.

Data library 163 includes Consumer Touch Points (CTP) 169, Utilization Management (UM) 171, Underwriting Questionnaire (UWQ) 173, Dun & Bradstreet (D&B) database 175, Electronic Medical Records (EMR) 177, and Health Risk Assessment (HRA) database 179.

Multimode Health-Trajectory Predictors

FIG. 2 shows process 200 for determining multimode health-trajectory predictors in accordance with an embodiment of the invention. Instead of focusing on cost prediction, multimode health-trajectory predictors attempt to understand the current state of, and predict transitions in, Bayesian relationships among the many semi-orthogonal outcomes attributes so that one can maximize positive impact through delivering the right intervention touch points to the right consumers at the right time before adverse transitions occur.

In healthcare, predictive models are used to improve underwriting accuracies and to identify at-risk members for clinical programs, such as various condition-centric disease management programs. Typically, prior art predictive models predict year-2 cost using year-1 payer claims data. Some prior art predictive modeling vendors predict future inpatient or emergency-room episodes since they represent high-cost events. The emphasis on cost makes sense given that the impetus for predictive models came from private and government payers struggling with rising healthcare costs.

Focusing on cost alone ignores the complex, multifaceted nature of healthcare consumers. Knowing future cost with R-sq of 10-25% is different from being able to impact the future health trajectory of each consumer. For example, it may be more beneficial to touch John suffering from pre-diabetic conditions with body mass index (BMI) of 32 than to intervene on behalf of Mark who has to go through kidney dialysis three times a week because of end-stage renal disease. From a cost perspective, Mark may be 20-40 times more expensive. But from an impact perspective, John would be a better candidate because his current conditions are more amenable to actions that can be taken now to prevent unpleasant consequences in the near future.

As a result of cost emphasis, prior art predictive models extract a standard set of features from Rx and/or medical claims data and apply linear regression or neural networks to predict year-2 cost. Typical features include disease flags and year-1 summary cost and utilization statistics, such as average inpatient cost per month, average Rx cost per month, # of physician visits per month, etc. Some predictive models divide the population into sub-groups using inputs from clinicians with the goal of designing a model tailored to each sub-group (MedAI). However, it may be quite difficult to design optimal clusters given the complexities of and interplays among the many factors that determine future health trajectories.

In order to address the shortcomings of the current generation of predictive models, an embodiment of the invention incorporates the following concepts:

-   1. Use of input data such as claims data 251, self-reported data 255, consumer behavior marketing (CBM) data 253, and biometric data 257 is augmented with inference engine 209 to predict multiple semi-orthogonal attributes with the goal of finding the best way to engage and motivate healthcare consumers to create positive impact. Input data is typically provided by electronic health record (EHR) database 203. Not everyone will have all the data assets. Therefore, key unknown variables need to be estimated using inference engine 209.
-   2. Flexible-dimension clustering process 211 creates an optimal set of consumer clusters from an impact perspective instead of using the same old disease hierarchy to create disease-centric consumer clusters.
-   3. Adaptive hypermedia content creation 221 leverages a comprehensive understanding of consumer needs and how to best provide a positive impact.

As shown in FIG. 2, inputs include:

-   Claims data 251: It comprises Rx/med/lab data, utilization-management (UM) data including pre-authorization, Rx/med benefit data, program touch-point data, Web log data, and limited member demographic data.
-   Consumer behavior marketing (CBM) data 253: This externally purchasable data provides inferred behavior, lifestyle, and attitudinal information on consumers from their demographic data and credit history.
-   Self-reported data 255: This includes health risk assessment (HRA), ecological momentary assessment (EMA), and experience sampling method (ESM) data administered through multiple communication channels, such as the Internet, cellphone, set top box, etc.
-   Biometric data 257: This encompasses data from wearable sensors (Bodymedia's BodyBugg™, Nike+ shoe sensor, polar band) and attachable sensors (glucometer, blood-pressure cuff, spirometer, etc.) transmitted through wired or wireless networks.

As shown in FIG. 2, processing includes:

-   Mixer 201: Not everyone will have all the data elements. Therefore, mixer 201 organizes incoming data into a schema appropriate for frame-based dynamical data processing. Furthermore, it differentiates between 0 and an empty set φ.
-   Preprocessing 205: This step performs secondary data audit and consumer-centric data structure generation. Primary data audit occurs during data creation in enterprise data warehouse (EDW).
    1) Data audit: Outliers are normalized using multi-pass peak-shearing. Multiple debit/credit entries and ghost claims are eliminated. The audit looks for potential gender/age mismatch errors (grandmother or father giving birth to a baby, or a premature baby's neonatal claims being assigned to his or her parents) using a look-up table.
    2) Consumer-centric data structure generation: For each consumer, we create a data structure that is efficient from memory and processing perspectives. It is a hierarchical structure encompassing the entire consumer touch-point suite of channels.
-   Transform 207: This step creates various bandpass-filtered maps over time. For instance, International Classification of Disease (ICD) 9/10 codes from medical claims and National Drug Codes (NDC) from Rx claims are converted into hierarchical condition-versus-time maps to facilitate the analysis of disease progression and the creation of disease clusters. Moreover, such a representation can help one to infer behavioral patterns from linking discrete events or following medication adherence for managing chronic conditions. A combination of ICD and Current Procedure Terminology (CPT) codes is used to derive Milliman & Robertson (M&R) categories over time, which is useful in assessing the utilization of various service types (inpatient, outpatient, emergency room, physician office visit, etc.) over time. Biometric data is processed through a large number of transformation algorithms, such as the fast Fourier transform, wavelet transform, local cosine transform, ensemble interval histogram, etc., in order to glean locally dynamic behaviors over time. Due to the infrequent nature of HRA and CBM data (i.e., people do not change their behavior or lifestyle every hour), locally dynamic behaviors serve as anchor points that vary much more slowly so that one can investigate the cumulative effects of linked local events over time on behavior change. The entire transformation process is analogous to multi-rate signal processing. At the end of transform, we extract a large number of static and dynamic features from each transformation space, as well as higher-order linked attributes spanning multiple transformation spaces in order to glean insights into disease clustering, disease progression, and their interplay with the consumer's psychosocial behavioral traits.
-   Inference engine 209: Knowing certain unobserved traits can be quite useful in devising tailored intervention strategies. Let $x_{\text{claims}}$, $x_{\text{CBM}}$, $x_{\text{SR}}$, and $x_{\text{bio}}$ represent the four data sets as previously discussed. If knowing one's body mass index (BMI) is desirable, one first builds modular predictive models from the sub-population that has BMI data such that $P(\mathrm{BMI} \mid x_{\text{claims}})$, $P(\mathrm{BMI} \mid x_{\text{CBM}})$, etc. constitute a feasible set of models for predicting BMI conditioned upon having other data assets. These models can be in the form of Bayesian networks, or regression or classification algorithms leveraging parametric and non-parametric learning algorithms.
-   Flexible-dimension clustering 211: This is an iterative process leveraging multiple fitness functions and predictive models as part of clustering. This step generates a set of clusters for each outcomes variable such that the output dispersion compression is maximized for improved prediction accuracy.
    1) For each outcomes variable, one performs feature optimization to find a sufficient-statistics feature subset.
    2) One performs clustering using k-means, expectation-maximization (EM), and Kohonen's self-organizing feature map. After clustering, there are $N_C$ clusters for each outcomes variable. For each cluster, one calculates the dispersion $\sigma_i$, $i = 1, \ldots, N_C$, of each of the outcomes distributions and compares it with the overall dispersion $\sigma_T$ from the entire population. The $i$-th cluster is accepted, based on its ability to compress the outcomes distribution, if its dispersion-compression ratio (DCR) satisfies $r_i = \sigma_T / \sigma_i > \gamma$, where $\gamma > 1$ is a predetermined dispersion-compression threshold. One creates a set of samples that pass the DCR test.
    3) For the samples that do not pass the first DCR test, repeat steps 1-2 until there is no sample left or the number of remaining samples is less than the minimum sample size.
-   Automatic model calibration 215: In real-world problems, data characteristics rarely remain stationary over time. With process 200, step 213 determines whether training is needed to update process 200 for new medical developments. For example, introduction of new medical technologies and drugs, changes in benefit plans and fee-reimbursement schedules, changing demographics, and even macroeconomic cycles can affect data characteristics. Built into the automatic model calibration algorithm 215 is a data-mismatch estimator that keeps track of statistical parameterization of key data assets over overlapping time frames and consumer clusters after removing secular trends, e.g., medical-cost inflation. Model parameters are updated and stored in model parameters database 217. During model initialization and subsequent re-calibration, the following takes place:
    1) Perform preprocessing step 205, transform step 207, inferring step 209, and flexible-dimension clustering step 211
    2) Feature optimization for each consumer cluster and outcomes variable using combinatorial and stochastic algorithms
    3) Model performance tuning to find the point of diminishing returns
    4) Multiple-model combining
-   Multiple-model scoring 219: Once process 200 has been trained, multiple-model scoring 219 is performed for input data 251-257. One generates the following health scores:
    1) Health scores as a function of current chronic conditions and predicted disease progression
    2) Behavior and lifestyle scores computed heuristically as a function of reported, observed (medication adherence, frequent ER visits, the level of interaction with care-management nurses, etc.), and inferred behavior and lifestyle attributes
    3) Engagement scores as a function of reported, observed, and inferred psychosocial and collaborative-filtering parameters
    4) Impact scores working in concert with evidence-based-medicine (EBM) supercharger 300 and utility functions associated with targets of opportunities and derived from the impact-analysis engine
    5) Conflict scores as a function of discrepancies between reported and observed behavioral/lifestyle factors and claims data
    6) Cost scores for multiple future time periods in chronic vs. acute categories
    7) Clinical utilization scores in terms of inpatient, emergency room/urgent care centers, medication, etc.
-   Adaptive hypermedia content generation 221: This module generates a tailored report of 1-2 pages succinctly summarizing current health conditions, likely future states, targets of opportunities, action plan, and benefits with drilldown menu.
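
The dispersion-compression ratio (DCR) test used in flexible-dimension clustering 211 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: cluster labels are taken as given (in the process above they would come from k-means, EM, or a self-organizing map), dispersion is measured with the population standard deviation, and the threshold value `gamma=1.5` and the sample outcomes are hypothetical.

```python
from statistics import pstdev


def dcr_accept(outcomes, labels, gamma=1.5):
    """Return (ratios, accepted): the DCR per cluster and the clusters passing it.

    DCR for cluster i is sigma_T / sigma_i, where sigma_T is the dispersion of
    the whole population and sigma_i the dispersion within cluster i.
    A cluster is accepted when its DCR exceeds gamma (gamma > 1).
    """
    sigma_total = pstdev(outcomes)
    clusters = {}
    for y, c in zip(outcomes, labels):
        clusters.setdefault(c, []).append(y)
    ratios = {}
    for c, ys in clusters.items():
        sigma_i = pstdev(ys)
        # A cluster with zero dispersion compresses the outcome perfectly.
        ratios[c] = float("inf") if sigma_i == 0 else sigma_total / sigma_i
    accepted = {c for c, r in ratios.items() if r > gamma}
    return ratios, accepted


# Hypothetical outcome values (e.g., a cost score) and cluster assignments.
outcomes = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5, 2.0, 12.0]
labels = [0, 0, 0, 1, 1, 1, 2, 2]
ratios, accepted = dcr_accept(outcomes, labels)
print(sorted(accepted))
```

In this example, clusters 0 and 1 are tight relative to the population and pass the test, while cluster 2 is more dispersed than the population as a whole; its samples would be sent back through feature optimization and clustering as described in step 3.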

In accordance with an embodiment of the invention, process 200 provides outputs including:

-   Consumer-centric metadata for a comprehensive view (both tabular and scientific visualization) of each consumer with appendages linking consumer-centric metadata to various stakeholders to facilitate stakeholder-centric data transformation
-   Health scores
-   Adaptive hypermedia content tailored to each consumer

Evidence-Based Medicine Supercharger

FIG. 3 shows a flow diagram for evidence-based medicine (EBM) supercharger 300 in accordance with an embodiment of the invention. From an EBM guideline or a medical journal article 351, evidence-based-medicine supercharger 300 generates a set of multidimensional inferred and observed utility functions, which is an essential ingredient in developing optimal resource allocation strategies. The utility function can be multidimensional at multiple levels of granularity in terms of patient or consumer clusters, leading to an $M \times N$ matrix, where $M$ and $N$ represent the number of utility components or objectives and the number of consumer clusters, respectively. For example, consumer clusters generated from the health-trajectory predictors may encompass the following groups: (1) those who are generally healthy from a claims perspective, but exhibit poor health habits in terms of high BMI and “couch-potato” characteristics; (2) those who suffer from chronic illnesses amenable to a lifestyle intervention, such as diabetes and cardiovascular disease; (3) people who have multiple co-morbid conditions, but for whom one cannot find treatment-related claims records ($N = 3$). From a segmented drilldown impact analysis of three intervention channels (Interactive Voice Response (IVR), health behavior coaching, and case management ($M = 3$)), one determines that the most effective intervention channels for the three population clusters are (IVR, health behavior coaching), (health behavior coaching), and (case management and health behavior coaching), respectively. The utility function is a $3 \times 3$ matrix, where each element $x_{ij}$ contains a utility score or return on investment for the $i$-th intervention channel applied to the $j$-th consumer cluster.
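
The 3×3 utility function in this example can be pictured as a small matrix lookup. The Python sketch below is illustrative only: the numeric utility scores, the cluster labels, and the selection rule (keep any channel whose score exceeds a hypothetical `min_score`) are assumptions, chosen so that the selected channels agree with the example above.

```python
CHANNELS = ["IVR", "health behavior coaching", "case management"]
CLUSTERS = [
    "healthy claims, poor habits",
    "lifestyle-amenable chronic illness",
    "co-morbid, no treatment claims",
]

# UTILITY[i][j]: hypothetical utility score (e.g., return on investment)
# for intervention channel i applied to consumer cluster j.
UTILITY = [
    [1.8, 0.6, 0.4],  # IVR
    [2.1, 2.5, 1.6],  # health behavior coaching
    [0.3, 0.9, 2.4],  # case management
]


def effective_channels(utility, channels, min_score=1.0):
    """For each cluster (column), list channels whose utility exceeds min_score."""
    n_clusters = len(utility[0])
    return [
        [channels[i] for i in range(len(channels)) if utility[i][j] > min_score]
        for j in range(n_clusters)
    ]


selected = effective_channels(UTILITY, CHANNELS)
for cluster, chans in zip(CLUSTERS, selected):
    print(f"{cluster}: {', '.join(chans)}")
```

A resource-allocation step could then rank or budget the selected channels per cluster; how that is done is outside what this sketch assumes.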

In accordance with an embodiment of the invention, evidence-based-medicine supercharger 300 includes:

-   Input databases:
    -   1) EBM database 317: It consists of EBM rules, a taxonomy for inducing rule parameters from medical journals, population parameters, rule strength, mapping look-up tables that map condition and drug names to ICD-9 and NDC, respectively, and the utility function. Population parameters encompass inclusion and exclusion criteria. Rule strength is a function of publication rank using a page-ranking algorithm, author prestige based on the number of connections in the publication network, journal prestige based on circulation, sample size, percentage of total cost affected, longitudinal duration, and the number of corroborating articles. The EBM taxonomy facilitates efficient induction of EBM-rule parameters from an exemplary journal abstract as shown in the Appendix. More algorithmic details will be discussed in the processing-algorithm subsection.
    -   2) Electronic Health Records (EHR) 203: This database contains claims data 251, self-reported data 255, and consumer behavior marketing (CBM) data 253.
-   Processing algorithms
    -   Text mining 301: The Appendix shows a semi-structured abstract from an article published in the New England Journal of Medicine. Instead of using a bag-of-words or natural-language-processing feature vector and a Naïve Bayes classifier to rank an abstract, one simply detects whether or not an abstract reports an outcomes study. This is a much easier problem and defers the strength-of-evidence classification until after integrated outcomes analysis. Next, one uses a combination of key words, tf*idf text weights (in which the importance of a word is based on its frequency of occurrence in a document, normalized by its natural frequency of occurrence in a corpus) with stemming and stop words, and distance measures from key words to fill in the hierarchical tree EBM database fields in the areas of:
        -   1) Type of outcomes research
        -   2) Patient characteristics: size, dropout rate (if available), characteristics in terms of inclusion and exclusion criteria, longitudinal duration, and trigger criteria
        -   3) Reported results

        The distance measures are necessary to leverage lexical analysis to understand higher-level relations and concepts between words in a sentence or a paragraph.
    -   Automatic EBM rule induction 303: Given the EBM database fields extracted from a medical journal, one uses secondary look-up tables to map drug names, diagnoses, and procedures onto NDC, ICD-9, CPT-4, and laboratory codes commonly used in claims-payment systems.
    -   Human-Computer Interface (HCI) for human confirmation 305: The induced EBM rule along with the highlighted abstract is presented to a clinician for final confirmation with or without edit.
    -   EBM population identification 307: One identifies potential control and intervention populations using the inclusion, exclusion, and trigger criteria. The presence or absence of the trigger criteria assigns a patient to the intervention or control group, respectively, provided that the patient satisfies the inclusion and exclusion criteria.
    -   Dual-space clustering 309: This step creates meaningful consumer clusters that are homogeneous in the optimized baseline-period-attribute-and-outcomes (y) vector space. The baseline period equals the pre-intervention period of a fixed duration.
        -   1) For each EBM guideline, one builds models that predict various outcomes metrics. Associated with each predictive model is an optimal feature subset (X ∈ R^(N), where N is the optimal feature dimension) derived from a combination of stochastic and combinatorial optimization algorithms.
        -   2) In the vector space spanned by X, one performs clustering using k-means, expectation-maximization (EM), and Kohonen's self-organizing feature map algorithms. After clustering, there are N_(C) clusters. For each cluster, one calculates the dispersion σ_(i), i=1, . . . , N_(C), of each of the outcomes distributions and compares it with the overall dispersion σ_(T) from the entire population. The i^(th) cluster is accepted if its dispersion-compression ratio (DCR) r_(i)=σ_(T)/σ_(i) exceeds γ, where γ>1 is a predetermined dispersion-compression threshold. Acceptance is thus based on the cluster's ability to compress the outcomes distribution for more precision in applying EBM from an outcomes perspective. One creates a set of accepted samples for which clusters in X are sufficiently precise for performing integrated outcomes analysis. One selects the clustering algorithm that provides the highest DCR.
        -   3) For the remaining population samples, perform feature optimization to derive a new optimal feature subset X^((k)). Compress X^((k)) into X_(c) (dim(X_(c))<<dim(X^((k)))) using linear discriminant analysis (LDA), discretizing the outcomes metrics should they be continuous. Next, perform clustering in the vector space spanned by X_(c) and y. Prior to clustering, normalize the vector space so that the mean and standard deviation of each component will be 0 and 1, respectively. The standard deviation of y can be higher to reflect its importance in determining clusters. Keep the clusters whose DCRs>1.
        -   4) For the remaining clusters, repeat step 3) until the number of remaining samples is below the minimum threshold, i.e., (k)→(k+1). The final remaining samples represent the final cluster.
    -   Integrated outcomes analysis 313: For each cluster, perform case-controlled impact analysis leveraging predictive and propensity-score models to account for both regression to the mean and selection bias. A comprehensive set of outcomes metrics encompasses both observed and inferred variables. For the inferred variables, one estimates individual and cluster prediction accuracies so that one can assess the level of statistical significance as a function of cluster size and model accuracy.
    -   Utility function generation 315: Finally, one generates a set of utility functions:
        -   1) Two-dimensional marginal utility functions over individual outcomes metrics and population clusters
        -   2) One-dimensional utility function over a composite outcomes metric with weights
        -   3) Pareto Frontier set for multiple outcomes metrics based on a user-defined multi-objective function
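The dispersion-compression test used in dual-space clustering can be sketched as follows. This is a minimal illustration, assuming that dispersion is measured as a population standard deviation of a single outcomes metric and that cluster labels have already been produced by some clustering algorithm; the sample outcomes, the labels, and the threshold γ=1.2 are hypothetical.

```python
from statistics import pstdev

def accepted_clusters(outcomes, labels, gamma=1.2):
    """Return {cluster label: DCR} for clusters whose DCR exceeds gamma."""
    sigma_total = pstdev(outcomes)  # overall dispersion sigma_T
    by_cluster = {}
    for y, label in zip(outcomes, labels):
        by_cluster.setdefault(label, []).append(y)
    accepted = {}
    for label, ys in by_cluster.items():
        sigma_i = pstdev(ys)  # dispersion within cluster i
        # Assumption: a zero-dispersion cluster counts as infinitely compressed.
        dcr = float("inf") if sigma_i == 0 else sigma_total / sigma_i
        if dcr > gamma:
            accepted[label] = dcr
    return accepted

# Hypothetical outcomes metric (for example, a cost index) and cluster labels.
outcomes = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 0.5, 9.0]
labels = [0, 0, 0, 1, 1, 1, 2, 2]
accepted = accepted_clusters(outcomes, labels)
```

In this example clusters 0 and 1 compress the outcomes distribution and are accepted, while cluster 2 is as dispersed as the whole population and would be passed on to the next clustering iteration.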

Outputs of evidence-based-medicine supercharger 300 include:

-   Utility functions tailored to each stakeholder, a composite outcomes metric, or multi-objective optimization or Pareto-efficient plots
-   Outcomes metrics
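A Pareto-efficient set over multiple outcomes metrics can be illustrated with a short Python sketch. It assumes that every objective is to be maximized and that each candidate is a tuple of hypothetical outcome scores; a user-defined multi-objective function would determine the actual objectives.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points that no other point dominates (all objectives maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (clinical benefit, cost savings) scores for five candidate strategies.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9), (0.2, 0.1)]
front = pareto_front(candidates)
```

The pairwise comparison is quadratic in the number of candidates, which is adequate for a small set of intervention strategies.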

Autonomous Healthcare Data Exploration System

FIG. 4 shows a flowchart for autonomous healthcare data exploration system 400 in accordance with an embodiment of the invention. Autonomous healthcare data exploration system 400 explores a healthcare database to look for “interesting” relationships autonomously using various signal processing and data mining algorithms. There is often substantial hidden insight in healthcare data that can be discovered. Autonomous data exploration is sometimes associated with fraud detection. In healthcare, gaming or exploitation of loopholes in fee-reimbursement policies can be a serious problem, which has led to utilization management or medical necessity review by payers. For example, one study reports that 39% of physicians surveyed use at least one of the following three gaming methods:

1. Exaggerating the severity of patients' conditions
2. Changing patients' billing diagnoses
3. Reporting signs or symptoms that patients didn't have

Fraud detection has been around for over two decades in a myriad of forms. It typically looks for outliers or uses models learned from labeled training data to identify suspicious activities for human confirmation. The two most widely used areas are in the credit-card and financial industries. The U.S. Securities and Exchange Commission (SEC) and research boutique firms pore through tick-by-tick financial data to look for anomalous trading patterns that can signal insider trading.

Just to illustrate the difficulty of transitioning commercial antifraud solutions to healthcare, the U.S. Government Accountability Office reports that instead of adapting commercially available antifraud software to Medicare use, the Health Care Financing Administration (HCFA) chose to enter into a multi-year agreement with the Los Alamos National Laboratory, citing numerous difficulties with adopting commercial software. Unfortunately, no such software, commercial or custom-built, is in widespread use today.

The focus on fraud pits one stakeholder against another when outright fraud is relatively rare, and a soft form of exploiting system loopholes is more common in healthcare. Therefore, there is a need for a more sophisticated and less demeaning system focused on learning hidden causal relations between treatment and health outcomes (both positive and negative) so as to gain the widest possible acceptance from all the stakeholders.

FIG. 4 shows the flowchart of autonomous healthcare data exploration system 400, which leverages multimode health-trajectory predictors along with a consumer-centric database 401. Autonomous healthcare data exploration system 400 includes the following components:

-   Inputs
    -   Consumer-centric database (CCDB) 401 consisting of membership, benefit-plan history, consumer-touch-point history, claims, self-reported, consumer behavior marketing, provider, and evidence-based medicine data
    -   Autonomous knowledge database, which is empty in the beginning, but will be populated with new and iteratively refined knowledge
-   Processing
    -   Projection 403: This step creates multiple projections of CCDB 401 over time so that one has a complete view of all that is happening to each consumer conditioned upon slowly-changing lifestyle, behavior, and psychographic parameters.
    -   Overlapped frame feature extraction 405: From each time frame of each projection space, one extracts an appropriate number of summarization and dynamic features so that one can track their trajectories over time.
    -   Multimode health-trajectory predictors 407: Predictors 407 predict future states of one's health around disease progression, engagement, and impact.
    -   Past-future dynamic clustering 409: Clustering is performed on the vector space spanned by the current set of features and predicted attributes. In one embodiment of such a system, the current set of features encompasses the parameterization of current disease conditions, utilization of medical resources, and lifestyle/health behavior. Predicted attributes may include disease progression, the level of impactability, and future cost. The key idea is to cluster consumers based on both where they are today and where they are likely to transition to in the future.
    -   Anomalous cluster detection and merging 411: Within each homogeneous cluster, one looks for outliers in joint and marginal spaces. Depending on the outlier-population size derived from each cluster, one merges outliers from multiple similar clusters to improve statistical power and significance.
    -   Outcomes analysis 413: For each outlier cluster, one looks for attributes with commonality and differences between outliers and normal cases. This search for common and uncommon attributes facilitates case-controlled outcomes analysis with drilldown along with the understanding of factors responsible for differences in outcomes.
    -   Causal pathway analysis 415: For each anomaly case identified, one uses a structural learning algorithm to induce a Bayesian network structure. Next, one ensures that causal parameters between control and test groups move in a logical way.
    -   GUI for human confirmation 419: Each piece of discovered knowledge is presented to a human expert for final confirmation and inclusion into the autonomous knowledge discovery database 417.
-   Outputs provided by autonomous healthcare data exploration system 400 include:
    -   Extracted knowledge
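The overlapped frame feature extraction step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the input is a single time-ordered series (a hypothetical monthly cost), the frame length and hop size are arbitrary, the summarization feature is the frame mean, and the dynamic feature is a simple end-to-end slope. The actual features used by step 405 are not specified here.

```python
def overlapped_frames(series, frame_len, hop):
    """Split a time-ordered series into frames of frame_len that advance by hop.

    Choosing hop < frame_len makes consecutive frames overlap.
    """
    last_start = len(series) - frame_len
    return [series[s:s + frame_len] for s in range(0, last_start + 1, hop)]

def frame_features(frame):
    """One summarization feature (mean) and one dynamic feature (slope per step)."""
    mean = sum(frame) / len(frame)
    slope = (frame[-1] - frame[0]) / (len(frame) - 1)  # requires len(frame) >= 2
    return {"mean": mean, "slope": slope}

# Hypothetical monthly cost for one consumer.
monthly_cost = [100, 120, 90, 300, 310, 280, 150, 140]
frames = overlapped_frames(monthly_cost, frame_len=4, hop=2)
features = [frame_features(f) for f in frames]
```

Tracking the per-frame features in order gives the trajectory that the downstream predictors and clustering steps would consume.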

Intelligent Health Benefit Design System

FIG. 5 shows an illustrative conceptual example of the optimal health benefit design in accordance with an embodiment of the invention. An intelligent benefit design system leverages ideas from consumer-understanding technologies, predictive modeling, impact analysis, and multi-objective optimization to design an individually tailored benefit product that balances the conflicting needs of moral hazard and social insurance by finding the acceptable ratio of profitability to subsidization for each product or plan configuration in a product bundle.

Element 515 in FIG. 5 shows a simplified two-dimensional efficient frontier in the two-dimensional space of premium and out-of-pocket (OOP) cost with an indifference curve. That is, higher premiums are generally associated with lower OOP costs and vice versa. An insurance company starts out with an initial set of product bundles 501. If the company introduces a new product for which no prior enrollment data is available, then the product enrollment distribution is estimated using adaptive conjoint learning and prediction 503. On the other hand, if product changes are evolutionary, then one can use prior enrollment data to develop and deploy predictive models to estimate the new product enrollment distribution given an initial set of product attributes 505. As part of designing an adaptive conjoint analysis (ACA) questionnaire, one leverages a consumer marketing database or a demographic database from the U.S. Census Bureau so that the questionnaire can be tailored to each consumer 507, 509.

The fundamental idea is to iterate the process of adjusting product attributes, estimating product enrollment distributions, and calculating economic parameters (projected profit/loss as well as the level of subsidization inherent in a medical insurance product) of each product bundle so that one achieves an acceptable trade-off between social insurance and moral hazard. That is, while the young and healthy are supposed to subsidize the cost of insurance for the old and sick, there needs to be an element of personal responsibility in benefit design so that people with poor health habits and a beneficiary mentality do not abuse the entire healthcare system to the detriment of all 511, 513. In short, benefit design must deal effectively with risk factors that can be mitigated within socially acceptable means. The plot labeled 517 shows the relationship between individual prediction accuracy measured in R-sq or R² and group prediction accuracy measured in predictive ratio (PR) mean (μ) and standard deviation (σ). Individual predictive accuracy becomes less important as group size increases, as in employer or group underwriting. However, in clinical settings and in predicting benefit enrollment, where adverse selection can occur frequently, individual predictive accuracy is of paramount importance.
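The difference between individual and group prediction accuracy can be made concrete with a small Python sketch. It assumes the common definitions of R² (one minus the ratio of residual to total sum of squares) and of the predictive ratio (total predicted cost divided by total actual cost for a group); the cost figures are hypothetical.

```python
def r_squared(actual, predicted):
    """Individual-level accuracy: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def predictive_ratio(actual, predicted):
    """Group-level accuracy: total predicted cost over total actual cost."""
    return sum(predicted) / sum(actual)

# Hypothetical annual costs for a five-member group; the model predicts the
# group average for every member.
actual = [1000, 200, 5000, 300, 1500]
predicted = [1600, 1600, 1600, 1600, 1600]

individual_accuracy = r_squared(actual, predicted)
group_accuracy = predictive_ratio(actual, predicted)
```

Here the model explains none of the member-to-member variation (R² of 0) yet prices the group exactly (PR of 1), which is why individual accuracy matters less for group underwriting than for clinical targeting or enrollment prediction.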

In healthcare, benefit design, according to prior art, is typically carried out by linking historical utilization and cost data to various benefit parameters, such as co-pay, deductible, co-insurance, maximum out-of-pocket, limits on Health Savings Account/Flexible Spending Account (HSA/FSA), etc. Then a loading factor (margin) is computed for each plan design, which sets the premium for the plan. Depending on the premium differential between plans, subsidization factors are calculated such that a plan attractive predominantly to the healthy (high-deductible plans) may subsidize the cost of another plan that appeals primarily to the sick so that the concept of social insurance can be preserved in plan design.

An important consideration in benefit design is risk management. If benefit parameters are particularly attractive to a certain segment of the population whose medical needs differ significantly from those of the general population, then such a plan has a high likelihood of attracting a biased population, which can lead to unexpected profit or loss depending on the direction of the bias. Unfortunately for health insurance companies, this phenomenon of biased population (called anti- or adverse selection) is not uncommon. The result is a cookie-cutter benefit design with a small number of selections so that the law of large numbers dominates the field.

More recently, under the banner of consumer-directed health plan (CDHP), many payers started introducing high-deductible, low-premium plans. The theory of the case for CDHP is that high-deductible plans with some form of medical savings account will turn beneficiary-mentality patients into sophisticated healthcare consumers. Unlike consumers in other industries, healthcare consumers may have a hard time correlating actual high-quality care with perceived quality, at least based on RAND's quality metrics. Furthermore, the initial thrust of CDHP was to attract the cream-of-the-crop population from employers offering plans from multiple payers. That is, nimble new-to-the-market payers introduced CDHP products to employers desperate to cut soaring health benefit costs. The end result was that dinosaur payers were saddled with the undesirable segment of the population, hurting their bottom line.

Studies suggest that while the young and healthy are potential winners of CDHP, their opportunities for savings are limited because of restrictions in plan design, such as portability and investment. Studies of post-CDHP health-resource utilizations and costs report mixed results with no clear trend. Perhaps mixed results are not surprising given the ambiguity of the theory of the case.

Perhaps the biggest shortcoming of the current health plan design is that few incorporate innovative design parameters, such as consumer-engagement strategies, incentives for lifestyle changes, and fun aspects in linking validated evidence-based-medicine guidelines, nutrition and exercise to health. Our design approach leverages the estimation of a consumer-preference function and projected utility functions derived from the impact analysis engine to move away from a cookie-cutter design and towards a tailored plan design that impacts health behavior change.

For new product launch 501, one first proceeds with adaptive conjoint questionnaire (ACQ) 503 that is designed to minimize the number of questions leveraging predictive questionnaire construction. From ACQ responses, one can estimate a consumer preference function at an individual level. From a pool of initial product bundles with preset features, one estimates the overall enrollment distribution for a group (i.e., an employer). From the overall enrollment distribution and the outputs of multimode health-trajectory predictors, one computes profit/loss for each product and generates a three-dimensional picture of profit/loss and compressed two-dimensional objectives (i.e., minimize premium and out-of-pocket or OOP expense) as shown in relationships 515 and 517. This picture will provide visual insights to facilitate the understanding of Pareto-efficient design parameters, which can lead to the reconfiguration of product features. This process of enrollment prediction and product reconfiguration is iterated until the incremental change in product-feature reconfiguration is below an acceptable threshold.

After the product launch, one starts with a fresh data set, which represents the actual product selection behavior by consumers. Unlike in conjoint analysis, one does not have information on exactly which products consumers traded off before making product-selection decisions. One has the following information on consumers and their product-selection behavior:

1. Demographics and behavior marketing (x_(demo), x_(cbm))
2. Prior product selection (x_(pps)), which doesn't exist for new consumers
3. Current product selection

The task at hand is to estimate a revised consumer preference function using real data. Let y and w denote the product-selection behavior and product features, respectively. Then, the estimation task is as follows:

ŷ = ƒ(x_(demo), x_(cbm), x_(pps), w, D(w, y)),

where D(w,y) is a distance function between w and y, and ƒ(•) can be estimated using parametric or nonparametric learning algorithms. Any differences between the conjoint and real-data models are stored in a database for continuous model adaptation and learning. More complex design with incentives requires utility functions associated with incentives from the impact-analysis engine. After estimating the consumer-preference function, there is a secondary step of identifying intervention opportunities given the characteristics of consumers choosing each product bundle. Based on utility functions and the outputs of the multimode health-trajectory predictors, the remaining task is to design an incentive program within each product bundle that will encourage high-risk members to participate in the program.
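One nonparametric way to estimate a preference function of this kind is a nearest-neighbor vote, sketched below in Python. This is only an illustration of the "nonparametric learning" option and not the estimator used by the platform: the feature vector (age, a chronic-risk score, and a prior-selection code), the product names, and k=3 are hypothetical, and the product features w and the distance term D(w,y) are omitted for brevity. Features are also left unscaled, which a real estimator would need to address.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Predict a product selection by majority vote among the k nearest consumers.

    train: list of (feature_vector, chosen_product) pairs.
    """
    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical observed selections: (age, chronic-risk score, prior-selection code).
observed = [
    ((25, 0.1, 0), "high-deductible"),
    ((30, 0.2, 0), "high-deductible"),
    ((28, 0.0, 0), "high-deductible"),
    ((55, 0.8, 1), "HMO"),
    ((60, 0.9, 1), "HMO"),
    ((58, 0.7, 1), "HMO"),
]
predicted_choice = knn_predict(observed, (27, 0.1, 0))
```

Comparing such real-data predictions against the conjoint-based estimates is one way to surface the model differences that the text proposes to store for continuous adaptation.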

FIG. 6 shows an example of Markov modeling for assessing a target of opportunity in accordance with an embodiment of the invention. Markov model 600 shows a disease progression related to diabetes. Markov model 600 shows the probability of transitioning from one disease state to another disease state based on whether the consumer obtains a prescribed treatment. Additionally, disease states may depend on observed behavioral/lifestyle factors including the attributes of the consumer. Attributes may include the category of lifestyle (e.g., “couch-potato”) and level of education of the consumer. The type of treatment and the efficacy of the treatment may depend on the consumer's attributes.

With state 601, a consumer, who is a “couch-potato,” is determined to be a pre-diabetic. As determined by intervention opportunity finder 103 (as shown in FIG. 1), there is a probability p_(1a) 609 a of the consumer becoming a diabetic (state 603) without any treatment and a probability p₁ 609 b if the consumer receives a prescribed treatment (treatment_1). For example, EBM supercharger 300 may determine that the consumer can substantially reduce the probability of becoming a diabetic with a proper diet and exercise regime under the supervision of a dietician and/or exercise coach.

When the consumer becomes a diabetic, there is a probability of developing coronary artery disease (corresponding to state 605). The corresponding treatment_2 (as determined by EBM supercharger 300) may be more radical than treatment_1. For example, treatment_2 may include one or more prescribed drugs that are typically more costly than providing a dietician and/or exercise coach. (In general, as a disease progresses, the associated costs increase.) The probability of a diabetic developing coronary artery disease is p_(2a) 611 a without treatment and p₂ 611 b with treatment.

In accordance with Markov model 600, once a consumer has developed coronary artery disease, the consumer may further develop renal failure and/or congestive heart failure (state 607). The probability of developing renal failure/congestive heart failure is p_(3a) 613 a without treatment and p₃ 613 b with treatment.

Markov model 600 may include states based on different attributes of a consumer. For example, state 615 is associated with the consumer having a physically active lifestyle. Consequently, the transition probability of disease progression is typically smaller than that of a consumer having a sedentary lifestyle (corresponding to state 601, in which a consumer is classified as a “couch-potato”).
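The effect of treatment on disease progression in a Markov model of this kind can be sketched in Python. The sketch makes several assumptions that are not part of Markov model 600: the chain only progresses (no recovery), the last state is absorbing, each period a consumer either stays or advances one state, and all transition probabilities are hypothetical stand-ins for p_(1a), p_(2a), p_(3a) (untreated) and p₁, p₂, p₃ (treated).

```python
STATES = ["pre-diabetic", "diabetic", "coronary artery disease", "renal failure/CHF"]

def transition_matrix(p1, p2, p3):
    """Progress-only chain: from state k, advance with probability p, else stay."""
    return [
        [1 - p1, p1, 0.0, 0.0],
        [0.0, 1 - p2, p2, 0.0],
        [0.0, 0.0, 1 - p3, p3],
        [0.0, 0.0, 0.0, 1.0],  # assumed absorbing final state
    ]

def step(distribution, matrix):
    """Advance the state distribution by one period (row vector times matrix)."""
    n = len(distribution)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def distribution_after(periods, matrix, start=(1.0, 0.0, 0.0, 0.0)):
    """State distribution after the given number of periods, starting pre-diabetic."""
    distribution = list(start)
    for _ in range(periods):
        distribution = step(distribution, matrix)
    return distribution

# Hypothetical per-period transition probabilities.
untreated = distribution_after(5, transition_matrix(0.30, 0.20, 0.15))
treated = distribution_after(5, transition_matrix(0.10, 0.08, 0.06))
```

Comparing `untreated` and `treated` state by state gives a rough measure of how much progression a treatment averts, which is the kind of quantity an intervention opportunity finder could weigh against treatment cost.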

Exemplary Scenario

Sarah is a 45-year-old mother of two children, overweight, pre-diabetic, being treated for hypertension and hyperlipidemia. At work, she needs to enroll in a health benefit plan since her employer switched to a new payer, Global Health (GH). In accordance with an embodiment of the invention, the following scenario illustrates what a consumer experiences.

-   Enrollment: Sarah is first given a combination of Predictive Health Risk Assessment (PHRA) interspersed with Adaptive Conjoint Analysis (ACA) questions. Even without a single claim, PHRA calculates future health trajectories and guides Sarah through the benefit selection process based on an adaptive questionnaire tree designed to minimize the number of questions while maximizing predictive accuracy. She ends up selecting an HMO plan with various incentives for staying healthy. The impact analysis engine provided the ROIs associated with incentives for consumers who fit Sarah's profile. She is given an instant analysis of her current health, likely health trajectories, and what she can do to prevent unpleasant outcomes. An interactive goal setting wraps up her first-day consumer experience with GH. Health trajectory predictors are based on PHRA/ACA questions, in which the optimal benefit design is part of resource allocation management (RAM). (With prior art, Sarah is typically given a list of traditional HMO, PPO, and Indemnity plans with a limited number of choices in deductibles, co-pays, and premium with health savings accounts.)
-   At-risk member identification: By virtue of PHRA, Sarah has already been identified as an at-risk member who can benefit from intervention. PHRA lists diabetes as a major risk factor given her current conditions, BMI, and lifestyle parameters inferred from external consumer behavior data obtained from Experian for the specific purpose of improving health guidance, not premium setting. Given her status, she gets a VAT call tailored to her situation, along with a two-page feedback/action plan letter based on her responses to the PHRA questionnaire, all during the first week as part of a welcoming package. The Integrated Health Management Platform supports this function with health trajectory predictors, intervention opportunity finder, and RAM. (With prior art, since Sarah is a new member, GH must wait for claims data to accumulate before running a predictive model that predicts 12-month future cost. Because of claims lag, the typical wait time is 6 months.)
-   Maintenance: Based on earlier communications, Sarah understands what to do. She takes PHRA frequently to report her progress and to see if her health scores are improving. Upon meeting her first goal of losing 10 lbs in 4 weeks and improving her health scores by 10%, GH sends her a USB pedometer. Now she uses it to keep track of her activity level daily, uploading activity data to her personal Web portal at GH, which provides additional data points to the IHM Platform in order to improve guidance for Sarah. Meanwhile the IHM Platform is exploring the healthcare database autonomously, looking for patterns that precede low-to-high or high-to-low transitions so that it can update its knowledge database. Furthermore, it is constantly monitoring the relationship between intervention and outcomes to ensure that every member gets the best possible touch points to maximize population health using both high-tech and human interventions. The multimode health-trajectory predictors perform predictions both on a regular basis and asynchronously (event-driven). All IHM components work seamlessly to make this happen. (With prior art, not knowing the full extent of her risk factors, she may live her life as she normally does. One day, she feels chest pain and goes to the ER. Upon examination, they find out that she needs a heart bypass. A further blood test shows her blood glucose level at 175 mg/dl, which makes her a diabetic, further complicating her recovery. About 3 months after her bypass surgery, GH finally has her claims data in an electronic data warehouse. The indigenous PM now flags her as a high-risk member, a clear case of regression to the mean and of fixing the door after the cow has already left. A nurse calls her to inquire if anything can be done to help her.)

Computer Implementation

FIG. 7 shows computer system 1 that supports an integrated health management platform (e.g., IHM platform 100 as shown in FIG. 1) in accordance with an embodiment of the invention. Elements of the present invention may be implemented with computer systems, such as the system 1. Computer system 1 includes a central processor unit 10, a system memory 12 and a system bus 14 that couples various system components including the system memory 12 to the central processor unit 10. System bus 14 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The structure of system memory 12 is well known to those skilled in the art and may include a basic input/output system (BIOS) stored in a read only memory (ROM) and one or more program modules such as operating systems, application programs and program data stored in random access memory (RAM).

Computer 1 may also include a variety of interface units and drives for reading and writing data. In particular, computer 1 includes a hard disk interface 16 and a removable memory interface 20 respectively coupling a hard disk drive 18 and a removable memory drive 22 to system bus 14. Examples of removable memory drives include magnetic disk drives and optical disk drives. The drives and their associated computer-readable media, such as a floppy disk 24, provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for computer 1. A single hard disk drive 18 and a single removable memory drive 22 are shown for illustration purposes only and with the understanding that computer 1 may include several of such drives. Furthermore, computer 1 may include drives for interfacing with other types of computer readable media.

A user can interact with computer 1 with a variety of input devices. FIG. 7 shows a serial port interface 26 coupling a keyboard 28 and a pointing device 30 to system bus 14. Pointing device 30 may be implemented with a mouse, track ball, pen device, or similar device. Of course one or more other input devices (not shown) such as a joystick, game pad, satellite dish, scanner, touch sensitive screen or the like may be connected to computer 1.

Computer 1 may include additional interfaces for connecting devices to system bus 14. FIG. 7 shows a universal serial bus (USB) interface 32 coupling a video or digital camera 34 to system bus 14. An IEEE 1394 interface 36 may be used to couple additional devices to computer 1. Furthermore, interface 36 may be configured to operate with particular manufacturer interfaces such as FireWire developed by Apple Computer and i.Link developed by Sony. Input devices may also be coupled to system bus 14 through a parallel port, a game port, a PCI board or any other interface used to couple an input device to a computer.

Computer 1 also includes a video adapter 40 coupling a display device 42 to system bus 14. Display device 42 may include a cathode ray tube (CRT), liquid crystal display (LCD), field emission display (FED), plasma display or any other device that produces an image that is viewable by the user. Additional output devices, such as a printing device (not shown), may be connected to computer 1.

Sound can be recorded and reproduced with a microphone 44 and a speaker 46. A sound card 48 may be used to couple microphone 44 and speaker 46 to system bus 14. One skilled in the art will appreciate that the device connections shown in FIG. 7 are for illustration purposes only and that several of the peripheral devices could be coupled to system bus 14 via alternative interfaces. For example, video camera 34 could be connected to IEEE 1394 interface 36 and pointing device 30 could be connected to USB interface 32.

Computer 1 can operate in a networked environment using logical connections to one or more remote computers or other devices, such as a server, a router, a network personal computer, a peer device or other common network node, a wireless telephone or wireless personal digital assistant. Computer 1 includes a network interface 50 that couples system bus 14 to a local area network (LAN) 52. Networking environments are commonplace in offices, enterprise-wide computer networks and home computer systems.

A wide area network (WAN) 54, such as the Internet, can also be accessed by computer 1. FIG. 7 shows a modem unit 56 connected to serial port interface 26 and to WAN 54. Modem unit 56 may be located within or external to computer 1 and may be any type of conventional modem such as a cable modem or a satellite modem. LAN 52 may also be used to connect to WAN 54. FIG. 7 shows a router 58 that may connect LAN 52 to WAN 54 in a conventional manner.

It will be appreciated that the network connections shown are exemplary and other ways of establishing a communications link between the computers can be used. The existence of any of various well-known protocols, such as TCP/IP, Frame Relay, Ethernet, FTP, HTTP and the like, is presumed, and computer 1 can be operated in a client-server configuration to permit a user to retrieve web pages from a web-based server. Furthermore, any of various conventional web browsers can be used to display and manipulate data on web pages.

The operation of computer 1 can be controlled by a variety of different program modules. Examples of program modules are routines, programs, objects, components, and data structures that perform particular tasks or implement particular abstract data types. The present invention may also be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, personal digital assistants and the like. Furthermore, the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

In an embodiment of the invention, central processor unit 10 determines health trajectory predictors from HRA data 151, claims data 153, and CBM data 155 (as shown in FIG. 1), which are obtained through LAN 52 and WAN 54. Central processor unit 10 may also provide the functionalities of intervention opportunity finder 103, resource allocation manager 105, and impact analysis engine 107. Consequently, central processor unit 10 may provide a target of opportunity for a consumer from evidence-based medicine (EBM) guidelines or medical journals 351 (as shown in FIG. 3). EBM guidelines (corresponding to EBM database 317) and electronic health records (EHR) (corresponding to EHR database 203) may be retrieved from hard disk drive 18.

As can be appreciated by one skilled in the art, a computer system with an associated computer-readable medium containing instructions for controlling the computer system may be utilized to implement the exemplary embodiments that are disclosed herein. The computer system may include at least one computer such as a microprocessor, a cluster of microprocessors, a mainframe, and networked workstations.

While the invention has been described with respect to specific examples including presently preferred modes of carrying out the invention, those skilled in the art will appreciate that there are numerous variations and permutations of the above-described systems and techniques that fall within the spirit and scope of the invention as set forth in the appended claims.

1. An apparatus for processing medical literature comprising: a text mining module detecting an outcome study for a medical treatment from a medical publication; a rule induction module mapping the medical treatment to a diagnostic and procedural code; a data analyzer configured to perform: (a) accessing a database for health data using the diagnostic and procedural code, the health data spanning previous treatment for a population of consumers; (b) associating an outcome metric for the medical treatment with a consumer group; and (c) repeating (a) and (b) to determine another outcome metric for another consumer group; and a utility generator generating a utility function from the plurality of outcome metrics, the utility function gauging an efficacy of at least one intervention channel for a consumer.
2. The apparatus of claim 1, the text mining module detecting another outcome study for another medical treatment.
3. The apparatus of claim 1, the data analyzer determining the consumer group by clustering a subset of the population from at least one attribute of the population to form a cluster.
4. The apparatus of claim 3, the utility generator determining the utility function that relates a utility score for a selected intervention channel as applied to the cluster.
5. The apparatus of claim 4, the utility generator determining another utility score for another intervention channel as applied to another cluster.
6. The apparatus of claim 1, further comprising: a confirmation interface presenting an abstract of the medical publication to a user and receiving, from the user, a notification that includes a confirmation of the abstract.
7. The apparatus of claim 6, the confirmation interface editing the abstract that is presented to the user.
8. The apparatus of claim 3, the data analyzer comparing a cluster dispersion measure of the cluster to an overall dispersion measure of the population and accepting the cluster from a ratio of the overall dispersion measure to the cluster dispersion measure.
9. The apparatus of claim 8, the data analyzer comparing the cluster dispersion measure and overall dispersion measure for each outcome distribution.
10. A method for processing medical literature comprising: (a) detecting an outcome study for a medical treatment from a medical publication; (b) mapping the medical treatment to a diagnostic and procedural code; (c) accessing a database for health data using the diagnostic and procedural code, the health data spanning previous treatment for a population of consumers; (d) associating an outcome metric for the medical treatment with a consumer group; and (e) repeating (c) and (d) to determine another outcome metric for another consumer group; and (f) generating a utility function from the plurality of outcome metrics, the utility function gauging an efficacy of at least one intervention channel for a consumer.
11. The method of claim 10, further comprising: (g) detecting another outcome study for another medical treatment.
12. The method of claim 10, further comprising: (g) determining the consumer group by clustering a subset of the population from at least one attribute of the population to form a cluster.
13. The method of claim 12, further comprising: (h) determining the utility function that relates a utility score for a selected intervention channel as applied to the cluster.
14. The method of claim 13, further comprising: (i) determining another utility score for another intervention channel as applied to another cluster.
15. The method of claim 10, further comprising: (g) presenting an abstract of the medical publication to a user and receiving, from the user, a notification that includes a confirmation of the abstract.
16. The method of claim 15, further comprising: (h) editing the abstract that is presented to the user.
17. The method of claim 12, further comprising: (h) comparing a cluster dispersion measure of the cluster to an overall dispersion measure of the population; and (i) accepting the cluster from a ratio of the overall dispersion measure to the cluster dispersion measure.
18. The method of claim 17, further comprising: (j) comparing the cluster dispersion measure and overall dispersion measure for each outcome distribution.
19. A computer-readable medium having computer-executable instructions to perform: (a) detecting an outcome study for a medical treatment from a medical publication; (b) mapping the medical treatment to a diagnostic and procedural code; (c) accessing a database for health data using the diagnostic and procedural code, the health data spanning previous treatment for a population of consumers; (d) associating an outcome metric for the medical treatment with a consumer group; and (e) repeating (c) and (d) to determine another outcome metric for another consumer group; and (f) generating a utility function from the plurality of outcome metrics, the utility function gauging an efficacy of at least one intervention channel for a consumer.
20. The computer-readable medium of claim 19, further configured to perform: (g) determining the consumer group by clustering a subset of the population from at least one attribute of the population to form a cluster.
21. The computer-readable medium of claim 20, further configured to perform: (h) determining the utility function that relates a utility score for a selected intervention channel as applied to the cluster.
22. The computer-readable medium of claim 20, further configured to perform: (h) comparing a cluster dispersion measure of the cluster to an overall dispersion measure of the population; and (i) accepting the cluster from a ratio of the overall dispersion measure to the cluster dispersion measure.
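
For illustration only, the clustering, cluster-acceptance, and utility-scoring steps recited above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the record fields (`age_band`, `channel`, `outcome`), the sample values, the use of population variance as the dispersion measure, the acceptance threshold of 1.5, and the use of the mean outcome as the utility score are all hypothetical choices that the claims do not specify.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical health records, assumed to have been selected already by a
# diagnostic and procedural code. Field names and values are illustrative.
RECORDS = [
    {"age_band": "18-39", "channel": "nurse_call", "outcome": 0.70},
    {"age_band": "18-39", "channel": "nurse_call", "outcome": 0.72},
    {"age_band": "18-39", "channel": "mailer", "outcome": 0.78},
    {"age_band": "18-39", "channel": "mailer", "outcome": 0.80},
    {"age_band": "65+", "channel": "nurse_call", "outcome": 0.44},
    {"age_band": "65+", "channel": "nurse_call", "outcome": 0.46},
    {"age_band": "65+", "channel": "mailer", "outcome": 0.30},
    {"age_band": "65+", "channel": "mailer", "outcome": 0.32},
]


def cluster_by(records, attribute):
    """Group records into clusters that share one attribute value."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record[attribute]].append(record)
    return dict(clusters)


def accept_cluster(cluster, population, min_ratio=1.5):
    """Accept a cluster when the ratio of overall dispersion to cluster
    dispersion reaches min_ratio (variance and threshold are assumptions)."""
    overall = pvariance([r["outcome"] for r in population])
    within = pvariance([r["outcome"] for r in cluster])
    if within == 0:
        # A perfectly tight cluster is accepted if the population varies.
        return overall > 0
    return overall / within >= min_ratio


def utility_function(records, attribute, min_ratio=1.5):
    """Return {(cluster, channel): utility score} for accepted clusters,
    using the mean outcome as an assumed utility score."""
    table = {}
    for name, cluster in cluster_by(records, attribute).items():
        if not accept_cluster(cluster, records, min_ratio):
            continue
        outcomes_by_channel = defaultdict(list)
        for record in cluster:
            outcomes_by_channel[record["channel"]].append(record["outcome"])
        for channel, outcomes in outcomes_by_channel.items():
            table[(name, channel)] = mean(outcomes)
    return table


if __name__ == "__main__":
    for key, score in sorted(utility_function(RECORDS, "age_band").items()):
        print(key, round(score, 3))
```

With the sample data, and assuming a higher outcome value is better, both age-band clusters pass the ratio test, and the resulting table scores the mailer channel above the nurse call for the younger cluster and the reverse for the older cluster.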